A Permutation Approach to Validation
Abstract
We present a permutation approach to validation, that is, estimation of the out-of-sample error. A typical use of validation is model selection. We establish the legitimacy of the proposed permutation complexity by proving a uniform bound on the out-of-sample error, similar to a VC-style bound. We demonstrate the approach extensively on synthetic data, standard data sets from the UCI repository, and a novel diffusion data set. The out-of-sample error estimates are comparable to those of cross-validation (CV), yet the method is more efficient and more robust, being less susceptible to overfitting during model selection.
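The abstract does not spell out how the permutation complexity is computed, so the following is only a minimal sketch of the general idea under stated assumptions: labels are ±1, the model is a least-squares linear classifier, and the penalty is taken to be the average correlation between randomly permuted labels and the predictions of a model refit to them. The function names (`fit_linear`, `predict_sign`, `permutation_penalty`) and the final way of combining in-sample error with the penalty are hypothetical illustrations, not the paper's definitions or its bound.

```python
import numpy as np


def fit_linear(X, y):
    """Least-squares linear fit with a bias column appended (assumed model)."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    w, *_ = np.linalg.lstsq(Xb, y, rcond=None)
    return w


def predict_sign(X, w):
    """Classify as +1 or -1 from the sign of the linear output."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    return np.where(Xb @ w >= 0, 1.0, -1.0)


def permutation_penalty(X, y, n_perm=200, seed=0):
    """Average fit to permuted labels (a hypothetical complexity proxy).

    Permuting y breaks the link between inputs and labels while keeping
    the label proportions, so any remaining fit reflects model capacity.
    """
    rng = np.random.default_rng(seed)
    gains = []
    for _ in range(n_perm):
        y_perm = rng.permutation(y)
        h = predict_sign(X, fit_linear(X, y_perm))
        # Correlation in [-1, 1] between refit predictions and permuted labels.
        gains.append(np.mean(y_perm * h))
    return float(np.mean(gains))


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    n, d = 100, 5
    X = rng.normal(size=(n, d))
    # Synthetic labels: a noisy linear rule on the first feature.
    y = np.where(X[:, 0] + 0.5 * rng.normal(size=n) >= 0, 1.0, -1.0)

    e_in = float(np.mean(predict_sign(X, fit_linear(X, y)) != y))
    penalty = permutation_penalty(X, y)
    # Illustrative combination only: a correlation c corresponds to an
    # error-rate change of c / 2 for +/-1 labels. This is an assumption,
    # not the bound proved in the paper.
    print("in-sample error:", e_in)
    print("permutation penalty:", penalty)
    print("penalized estimate:", e_in + penalty / 2)
```

For model selection, one would compute this penalized quantity for each candidate model and pick the smallest; unlike k-fold cross-validation, no data is held out, at the cost of refitting once per permutation.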
Similar resources
Cross-validation in high-dimensional spaces: a lifeline for least-squares models and multi-class LDA
Least-squares models such as linear regression and Linear Discriminant Analysis (LDA) are amongst the most popular statistical learning techniques. However, since their computation time increases cubically with the number of features, they are inefficient in high-dimensional neuroimaging datasets. Fortunately, for k-fold cross-validation, an analytical approach has been developed that yields th...
A Multi-objective Immune System for a New Bi-objective Permutation Flowshop Problem with Sequence-dependent Setup Times
We present a new mathematical model for a permutation flowshop scheduling problem with sequence-dependent setup times considering minimization of two objectives, namely makespan and weighted mean total earliness/tardiness. Only small-sized problems with up to 20 jobs can be solved by the proposed integer programming approach. Thus, an effective multi-objective immune system (MOIS) is ...
Permutation Testing in Multivariate Regression Trees
Formal testing procedures are generally not available for determining which splits in a regression tree are significant. Such a procedure is presented using permutation testing applied to order statistics. Finally, traditional cross validation and the permutation testing procedure are compared in a specific example.
Language Learning Strategies: A Strategy-Based Approach to L2 Learning, Strategic Competence, and Test Validation
Multivariate Analysis of fMRI using Fast Simultaneous Training of Generalized Linear Models (FaSTGLZ)
We present an efficient algorithm for simultaneously training elastic-net-regularized generalized linear models across many related problems, which may arise from bootstrapping, cross-validation and nonparametric permutation testing. Our approach leverages the redundancies across problems to obtain ≈ 10x computational improvements relative to solving the problems sequentially by the standard gl...